AITopics | San Martín Department

San Martín Department

44af065477781e7f8a8589b14a62c489-Paper-Conference.pdf

Neural Information Processing Systems, Oct-10-2025, 00:51:20 GMT

large language model, machine learning, natural language, (20 more...)

Neural Information Processing Systems

Country:

South America > Peru > San Martín Department (0.04)
South America > Peru > Pasco Department (0.04)
South America > Peru > Junín Department (0.04)
(4 more...)

Genre:

Research Report > Experimental Study (0.93)
Workflow (0.68)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Planning & Scheduling (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Robots (0.74)
(2 more...)

Meta-Optimization and Program Search using Language Models for Task and Motion Planning

Shcherba, Denis, Cobo-Briesewitz, Eckart, Braun, Cornelius V., Toussaint, Marc

arXiv.org Artificial Intelligence, Sep-18-2025

Intelligent interaction with the real world requires robotic agents to jointly reason over high-level plans and low-level controls. Task and motion planning (TAMP) addresses this by combining symbolic planning and continuous trajectory generation. Recently, foundation model approaches to TAMP have presented impressive results, including fast planning times and the execution of natural language instructions. Yet, the optimal interface between high-level planning and low-level motion generation remains an open question: prior approaches are limited by either too much abstraction (e.g., chaining simplified skill primitives) or a lack thereof (e.g., direct joint angle prediction). Our method introduces a novel technique employing a form of meta-optimization to address these issues by: (i) using program search over trajectory optimization problems as an interface between a foundation model and robot control, and (ii) leveraging a zero-order method to optimize numerical parameters in the foundation model output. Results on challenging object manipulation and drawing tasks confirm that our proposed method improves over prior TAMP approaches.
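The zero-order step mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it is a plain random local search, chosen here only because it needs no gradients. The names `zero_order_search` and `toy_cost` are hypothetical, and `toy_cost` is a stand-in for whatever score a trajectory optimizer would return for a given set of numerical parameters.

```python
import random


def zero_order_search(cost, init_params, step=0.5, iters=500, seed=0):
    """Gradient-free local search: perturb the parameters, keep improvements."""
    rng = random.Random(seed)
    best = list(init_params)
    best_cost = cost(best)
    for _ in range(iters):
        # Sample a candidate near the current best using Gaussian noise.
        candidate = [p + rng.gauss(0.0, step) for p in best]
        candidate_cost = cost(candidate)
        # Only cost evaluations are used; no derivative of `cost` is needed.
        if candidate_cost < best_cost:
            best, best_cost = candidate, candidate_cost
    return best, best_cost


def toy_cost(params):
    """Hypothetical stand-in for 'solve the motion problem and score it'."""
    x, y = params
    return (x - 1.0) ** 2 + (y + 2.0) ** 2


params, final_cost = zero_order_search(toy_cost, [0.0, 0.0])
print(params, final_cost)
```

In a setting like the one described, the parameter vector would hold the numeric constants appearing in the model-generated program, and the cost would come from running the resulting optimization problem; both of those pieces are assumed here rather than taken from the paper.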

constraint, large language model, natural language, (13 more...)

arXiv.org Artificial Intelligence

2505.03725

Country:

South America > Peru > San Martín Department (0.04)
South America > Peru > Loreto Department (0.04)
Europe > Germany > Berlin (0.04)
Asia > South Korea > Seoul > Seoul (0.04)

Genre: Research Report > New Finding (0.93)

Technology:

Information Technology > Artificial Intelligence > Robots > Robot Planning & Action (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.97)

Intuitive Human-Robot Interfaces Leveraging on Autonomy Features for the Control of Highly-redundant Robots

Torielli, Davide

arXiv.org Artificial Intelligence, May-13-2025

[...] With the TelePhysicalOperation interface, the user can teleoperate the different capabilities of a robot (e.g., single/double arm manipulation, wheel/leg locomotion) by applying virtual forces on selected robot body parts. This approach emulates the intuitiveness of physical human-robot interaction, but at the same time it allows the robot to be teleoperated from a safe distance, in a way that resembles a "Marionette" interface. The system is further enhanced with wearable haptic feedback functions to align better with the "Marionette" metaphor, and a user study has been conducted to validate its efficacy with and without the haptic channel enabled. Considering the importance of robot independence, the TelePhysicalOperation interface incorporates autonomy modules to handle, for example, the teleoperation of dual-arm mobile base robots for bimanual object grasping and transportation tasks. With the laser-guided interface, the user can indicate points of interest to the robot using a simple but effective laser emitter device. With a neural network-based vision system, the robot tracks the laser projection in real time, allowing the user to indicate not only fixed goals, like objects, but also paths to follow. With the implemented autonomous behavior, a mobile manipulator employs its locomanipulation abilities to follow the indicated goals. The behavior is modeled using Behavior Trees, exploiting their reactivity to promptly respond to changes in goal positions, and their modularity to adapt the motion planning to the task needs. The proposed laser interface has also been employed in an assistive scenario. In this case, users with upper-limb impairments can control an assistive manipulator by directing a head-worn laser emitter at the points of interest, to collaboratively address activities of everyday life. [...]

laser-based interaction and behavior tree, machine learning, shared locomanipulation motion generation, (20 more...)

arXiv.org Artificial Intelligence

doi: 10.15167/torielli-davide_phd2024-02-20

2505.07668

Country:

North America > United States (0.14)
South America > Peru > Loreto Department (0.13)
Africa > Central African Republic > Ombella-M'Poko > Bimbo (0.04)
(5 more...)

Genre:

Summary/Review (1.00)
Questionnaire & Opinion Survey (1.00)
Overview (1.00)
Research Report > New Finding (0.45)

Industry:

Information Technology (1.00)
Government (0.93)
Materials (0.92)
(3 more...)

Technology:

Information Technology > Artificial Intelligence > Robots > Manipulation (1.00)
Information Technology > Artificial Intelligence > Robots > Locomotion (1.00)
Information Technology > Artificial Intelligence > Robots > Robot Planning & Action (0.87)
(2 more...)

Galaxy: A Resource-Efficient Collaborative Edge AI System for In-situ Transformer Inference

Ye, Shengyuan, Du, Jiangsu, Zeng, Liekang, Ou, Wenzhong, Chu, Xiaowen, Lu, Yutong, Chen, Xu

arXiv.org Artificial Intelligence, May-27-2024

Transformer-based models have unlocked a plethora of powerful intelligent applications at the edge, such as voice assistants in smart homes. Traditional deployment approaches offload the inference workloads to a remote cloud server, which puts substantial pressure on the backbone network and raises users' privacy concerns. To address this, in-situ inference has recently been recognized for edge intelligence, but it still confronts significant challenges stemming from the conflict between intensive workloads and limited on-device computing resources. In this paper, we leverage our observation that many edge environments usually comprise a rich set of accompanying trusted edge devices with idle resources and propose Galaxy, a collaborative edge AI system that breaks the resource walls across heterogeneous edge devices for efficient Transformer inference acceleration. Galaxy introduces a novel hybrid model parallelism to orchestrate collaborative inference, along with a heterogeneity-aware parallelism planning for fully exploiting the resource potential. Furthermore, Galaxy devises a tile-based fine-grained overlapping of communication and computation to mitigate the impact of tensor synchronizations on inference latency under bandwidth-constrained edge environments. Extensive evaluation based on prototype implementation demonstrates that Galaxy remarkably outperforms state-of-the-art approaches under various edge environment setups, achieving up to 2.5x end-to-end latency reduction.
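To make the idea of heterogeneity-aware planning concrete, here is a small sketch that splits a fixed number of work units across devices in proportion to their relative speed. It is an assumption-laden illustration, not Galaxy's planner: the abstract does not describe the algorithm, and the function name `partition_units`, the notion of a "work unit", and the speed figures are all hypothetical.

```python
def partition_units(num_units, device_speeds):
    """Split `num_units` across devices proportionally to `device_speeds`.

    Uses largest-remainder rounding so the shares always sum to num_units.
    """
    total_speed = sum(device_speeds)
    ideal = [num_units * s / total_speed for s in device_speeds]
    shares = [int(x) for x in ideal]  # round each ideal share down
    leftover = num_units - sum(shares)
    # Hand the remaining units to the devices with the largest fractional part.
    order = sorted(
        range(len(ideal)), key=lambda i: ideal[i] - shares[i], reverse=True
    )
    for i in order[:leftover]:
        shares[i] += 1
    return shares


# Three hypothetical devices, the third being three times faster than the first.
print(partition_units(12, [1.0, 2.0, 3.0]))
```

A proportional split like this ignores memory limits and communication cost, which a real planner for bandwidth-constrained edge devices would also have to weigh.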

large language model, machine learning, natural language, (20 more...)

arXiv.org Artificial Intelligence

2405.17245

Country:

Asia > China > Guangdong Province > Guangzhou (0.05)
Europe > Italy > Calabria > Catanzaro Province > Catanzaro (0.04)
Oceania > Australia > Victoria > Bass Strait (0.04)
(4 more...)

Genre: Research Report > New Finding (0.67)

Industry: Information Technology > Security & Privacy (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.90)

Automatic Clipping: Differentially Private Deep Learning Made Easier and Stronger

Bu, Zhiqi, Wang, Yu-Xiang, Zha, Sheng, Karypis, George

arXiv.org Artificial Intelligence, Oct-3-2023

Per-example gradient clipping is a key algorithmic step that enables practical differentially private (DP) training for deep learning models. The choice of clipping threshold R, however, is vital for achieving high accuracy under DP. We propose an easy-to-use replacement, called automatic clipping, that eliminates the need to tune R for any DP optimizer, including DP-SGD, DP-Adam, DP-LAMB and many others. The automatic variants are as private and computationally efficient as existing DP optimizers, but require no DP-specific hyperparameters and thus make DP training as amenable as the standard non-private training. We give a rigorous convergence analysis of automatic DP-SGD in the non-convex setting, showing that it can enjoy an asymptotic convergence rate that matches the standard SGD, under a symmetric gradient noise assumption of the per-sample gradients (commonly used in the non-DP literature). We demonstrate on various language and vision tasks that automatic clipping outperforms or matches the state-of-the-art, and can be easily employed with minimal changes to existing codebases.
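The contrast between threshold-based clipping and a threshold-free replacement can be sketched in a few lines. The first function is the standard per-example clipping rule, which rescales each gradient so its L2 norm is at most R. The second is an assumed form of a threshold-free variant, normalization by the gradient norm plus a small constant; the abstract does not state the formula, so both `normalize_per_example` and the value of `gamma` should be read as illustrative assumptions. `noisy_sum` is a hypothetical helper, not an API of any DP library.

```python
import math
import random


def l2_norm(g):
    return math.sqrt(sum(x * x for x in g))


def clip_per_example(g, R):
    """Standard clipping: scale g so that its L2 norm is at most R."""
    norm = l2_norm(g)
    scale = min(1.0, R / norm) if norm > 0 else 1.0
    return [x * scale for x in g]


def normalize_per_example(g, gamma=0.01):
    """Assumed threshold-free variant: g / (||g|| + gamma), norm below 1."""
    norm = l2_norm(g)
    return [x / (norm + gamma) for x in g]


def noisy_sum(grads, transform, sensitivity, noise_multiplier, rng):
    """Sum transformed per-example gradients and add Gaussian noise."""
    dim = len(grads[0])
    total = [0.0] * dim
    for g in grads:
        t = transform(g)
        for j in range(dim):
            total[j] += t[j]
    # Noise scale is tied to the largest norm any single example can have.
    sigma = noise_multiplier * sensitivity
    return [v + rng.gauss(0.0, sigma) for v in total]


grads = [[3.0, 4.0], [0.3, 0.4]]
rng = random.Random(0)
# With clipping, the sensitivity is the tuned threshold R (here 1.0).
print(noisy_sum(grads, lambda g: clip_per_example(g, 1.0), 1.0, 1.0, rng))
# With normalization, every transformed gradient has norm below 1, so no R.
print(noisy_sum(grads, normalize_per_example, 1.0, 1.0, rng))
```

The point of the comparison is that the second path has no R to tune: the sensitivity bound comes from the normalization itself.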

artificial intelligence, deep learning, machine learning, (17 more...)

arXiv.org Artificial Intelligence

2206.07136

Country:

South America > Peru > San Martín Department (0.04)
North America > United States > Michigan > Washtenaw County > Ann Arbor (0.04)
North America > Dominican Republic (0.04)
(4 more...)

Genre: Research Report (1.00)

Industry: Information Technology > Security & Privacy (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

DDSP: Differentiable Digital Signal Processing

Engel, Jesse, Hantrakul, Lamtharn, Gu, Chenjie, Roberts, Adam

arXiv.org Machine Learning, Jan-14-2020

Abstract. Most generative models of audio directly generate samples in one of two domains: time or frequency. While sufficient to express any signal, these representations are inefficient, as they do not utilize existing knowledge of how sound is generated and perceived. A third approach (vocoders/synthesizers) successfully incorporates strong domain knowledge of signal processing and perception, but has been less actively researched due to limited expressivity and difficulty integrating with modern auto-differentiation-based machine learning methods. In this paper, we introduce the Differentiable Digital Signal Processing (DDSP) library, which enables direct integration of classic signal processing elements with deep learning methods. Focusing on audio synthesis, we achieve high-fidelity generation without the need for large autoregressive models or adversarial losses, demonstrating that DDSP enables utilizing strong inductive biases without losing the expressive power of neural networks. Further, we show that combining interpretable modules permits manipulation of each separate model component, with applications such as independent control of pitch and loudness, realistic extrapolation to pitches not seen during training, blind dereverberation of room acoustics, transfer of extracted room acoustics to new environments, and transformation of timbre between disparate sources. In short, DDSP enables an interpretable and modular approach to generative modeling, without sacrificing the benefits of deep learning. The library is publicly available and we welcome further contributions from the community and domain experts.

1 Introduction. Neural networks are universal function approximators in the asymptotic limit (Hornik et al., 1989), but their practical success is largely due to the use of strong structural priors such as convolution (LeCun et al., 1989), recurrence (Sutskever et al., 2014; Williams & Zipser, 1990; Werbos, 1990), and self-attention (Vaswani et al., 2017). These architectural constraints promote generalization and data efficiency to the extent that they align with the data domain. From this perspective, end-to-end learning relies on structural priors to scale, but the practitioner's toolbox is limited to functions that can be expressed differentiably. Here, we increase the size of that toolbox by introducing the Differentiable Digital Signal Processing (DDSP) library, which integrates interpretable signal processing elements into modern automatic differentiation software (TensorFlow). While this approach has broad applicability, we highlight its potential in this paper through exploring the example of audio synthesis.
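As a rough illustration of the kind of interpretable signal processing element discussed above, the sketch below implements a bare harmonic (additive) oscillator bank in plain Python. It is not the DDSP library's API and it is not differentiable; the function name `harmonic_synth` and all parameter names are hypothetical. It only shows why pitch and loudness can be controlled independently in such a model: each one enters the computation as its own explicit input.

```python
import math


def harmonic_synth(f0_hz, harmonic_amps, loudness, sample_rate=16000,
                   num_samples=160):
    """Sum of sinusoids at integer multiples of the fundamental f0_hz."""
    total = sum(harmonic_amps)
    # Normalize the harmonic distribution so loudness alone sets the scale.
    weights = [a / total for a in harmonic_amps]
    nyquist = sample_rate / 2.0
    samples = []
    for n in range(num_samples):
        value = 0.0
        for k, w in enumerate(weights, start=1):
            freq = k * f0_hz
            if freq < nyquist:  # skip harmonics that would alias
                value += w * math.sin(2.0 * math.pi * freq * n / sample_rate)
        samples.append(loudness * value)
    return samples


# Same timbre and loudness, two different pitches.
low = harmonic_synth(220.0, [1.0, 0.5, 0.25], loudness=0.8)
high = harmonic_synth(440.0, [1.0, 0.5, 0.25], loudness=0.8)
print(len(low), max(abs(s) for s in low), max(abs(s) for s in high))
```

In a differentiable setting, the same structure would be written with tensor operations so that a network can predict `f0_hz`, `harmonic_amps`, and `loudness` and be trained through the synthesizer; that part is assumed here and not shown.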

artificial intelligence, machine learning, synthesizer, (19 more...)

arXiv.org Machine Learning

2001.04643

Country:

South America > Peru > San Martín Department (0.05)
North America > United States > Massachusetts > Suffolk County > Boston (0.04)
North America > United States > California > Santa Clara County > Mountain View (0.04)
Asia > Vietnam > Hanoi > Hanoi (0.04)

Genre: Research Report (0.50)

Industry:

Media > Music (0.96)
Leisure & Entertainment (0.96)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.88)
